Abstract. We present novel and sharp lower bounds for higher load
moments in the classical problem of mapping $M$ balls into $N$ bins by
$q$-universal hashing, specialized to the case when $M = N$. As a corollary
we prove a tight counterpart for the result about min-entropy condensers
due to Dodis, Pietrzak and Wichs (CRYPTO'14), which has found
important applications in key derivation. It states that condensing $k$ bits of
min-entropy into a $k$-bit string $\epsilon$-close to almost full min-entropy
(precisely $k - \log\log(1/\epsilon)$ bits of entropy) can be achieved by the use of
$q$-independent hashing with $q = \log(1/\epsilon)$. We prove that when given a
source of min-entropy $k$ and aiming at entropy loss $\ell = \log\log(1/\epsilon) - 3$,
the independence level $q = (1 - o(1))\log(1/\epsilon)$ is necessary (for small
values of $\epsilon$), which almost matches the positive result. Besides these
asymptotic bounds, we provide clear hard bounds in terms of Bell numbers
and some numerical examples. Our technique is based on an explicit
representation of the load moments in terms of Stirling numbers, some
asymptotic estimates on Stirling numbers and a tricky application of the
Paley-Zygmund inequality.
1 Introduction


1.1 Universal hashing and key derivation 

Random variables $\xi_1, \ldots, \xi_N$ are called $q$-wise independent if every $q$ of them are
fully independent. $q$-wise independence finds important applications in
cryptography, for example in constructing pseudorandom generators [HILL88], oblivious
transfer protocols [BR94], or key derivation [DPW14]. In this work we focus
on the last area, where the most recent results offer a huge improvement by
replacing general-purpose randomness extractors with randomness condensers based on
independent hashing.


1.2 Better key derivation by independent hashing 

A min-entropy condenser is a primitive which transforms a distribution with 
some entropy (of a possibly small rate) into a distribution of almost full entropy. 
Dodis et al. prove the following theorem:

Theorem 1 (Parameters for $q$-universal condensers, [DPW14]). Any
$q$-universal family from $n$ to $m$ bits is a $(k, \ell, \epsilon)$-condenser with $k = m$,
$q = \log(1/\epsilon)$, $\ell = \log q$.

Informally, it states that by $q$-independent hashing we condense a possibly long
string of min-entropy $k$ into a $k$-bit string $2^{-q}$-close (in the statistical distance)
to a $k$-bit string of entropy $k - \log q$, which is almost full. This technical result
is a key ingredient of their important work on key derivation. The second
ingredient is the observation that for a wide class of so-called unpredictability
applications one can use a weak key with only high entropy, achieving roughly the same
quality as with a key close to uniform. Combining these two facts they are able
to reduce the entropy loss from $2\log(1/\epsilon)$, offered by general-purpose
extractors (necessary by the RT-bound [RTS00]), to roughly $\ell = \log\log(1/\epsilon)$, offered
by $q$-independent condensers where $q = \log(1/\epsilon)$, when the required security
strength is $\epsilon$. The higher the independence level $q$, the better the security
guarantees we get. However, too large a $q$ affects the efficiency of computations
(time complexity) and the sampling cost (the need for longer seeds) of the hashing
family. Thus the following question is natural:

Q: Suppose we use $q$-wise independent hashing to condense a source of
min-entropy $k$ into an $m$-bit key close to having min-entropy almost $m$.
What is the minimal value of $q$?

This question is stated informally, depending on what we understand
by "close" and "almost full". Taking the positive result as a reference point, we set
$k = m$ and allow the derived key to be $\epsilon$-close to a $k$-bit source of $k - \log\log(1/\epsilon)$
min-entropy. A key with this quality ensures total security roughly $\epsilon/\log(1/\epsilon)$
for any unpredictability application [DPW14].
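To make the comparison concrete, the following minimal sketch evaluates the two entropy losses discussed above for a given security level. The function names are illustrative and not taken from [DPW14]; the formulas are the ones quoted in the text, namely $2\log(1/\epsilon)$ for extractors and $q = \log(1/\epsilon)$, $\ell = \log q$ for the condenser of Theorem 1.

```python
import math

def extractor_loss(eps):
    """Entropy loss 2*log2(1/eps) of a general-purpose extractor (RT-bound)."""
    return 2 * math.log2(1 / eps)

def condenser_params(eps):
    """Independence level q = log2(1/eps) and entropy loss log2(q) from Theorem 1."""
    q = math.log2(1 / eps)
    return q, math.log2(q)

eps = 2.0 ** -80  # security level used as an example in this paper
q, loss = condenser_params(eps)
print(f"extractor loss: {extractor_loss(eps):.0f} bits")
print(f"condenser: q = {q:.0f}, loss = {loss:.2f} bits")
```

For $\epsilon = 2^{-80}$ the condenser needs independence level $q = 80$ and loses $\log 80$ bits, compared with 160 bits for the extractor.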


1.3 Our contribution 


Summary. We show that the parameters for min-entropy condensers stated
in Theorem 1 are essentially tight. Our approach is based on a novel
anti-concentration inequality derived by a Paley-Zygmund trick, which involves higher
moments of the load in a balls-bins problem. We also use some bounds on Stirling
numbers of the second kind to simplify the expressions describing load moments
into more compact forms.

Moments of independent boolean sums. We state the following useful fact,
noting that similar observations have already been exploited by some authors
(see for instance [ T10] for a one-sided version of this inequality).

Proposition 1 (Explicit moments of balls-bins loads). Let $S = \sum_{i=1}^{M} \xi_i$
be a sum of $q$-wise independent boolean random variables with mean $1/N$.
Then we have the following identity

$$\mathbb{E} S^q = \sum_{j=1}^{q} S(q,j) \, \frac{M^{\underline{j}}}{N^{j}} \tag{1}$$

where $S(q,j)$ denote Stirling numbers of the second kind and
$M^{\underline{j}} = M(M-1)\cdots(M-j+1)$ is the falling factorial.

Remark 1 (A balls-bins statement). Think of $M$ balls, $N$ bins, and $\xi_i$ indicating
whether the $i$-th ball is mapped into a chosen bin. Then $S$ is precisely the load
of the bin.
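The expansion of load moments through Stirling numbers can be checked on small instances. The sketch below is an illustration only: it assumes the moment formula $\mathbb{E} S^q = \sum_j S(q,j) M(M-1)\cdots(M-j+1)/N^j$ and compares it, in exact rational arithmetic, with the $q$-th moment of a Binomial$(M, 1/N)$ variable, which is the load under fully independent (hence $q$-wise independent) hashing. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def stirling2(q, j):
    """Stirling number of the second kind via S(n,k) = k*S(n-1,k) + S(n-1,k-1)."""
    table = [[0] * (j + 1) for _ in range(q + 1)]
    table[0][0] = 1
    for n in range(1, q + 1):
        for k in range(1, min(n, j) + 1):
            table[n][k] = k * table[n - 1][k] + table[n - 1][k - 1]
    return table[q][j]

def falling_factorial(m, j):
    """m * (m - 1) * ... * (m - j + 1)."""
    result = 1
    for i in range(j):
        result *= m - i
    return result

def moment_via_stirling(M, N, q):
    """q-th load moment computed from the Stirling-number expansion."""
    return sum(
        Fraction(stirling2(q, j) * falling_factorial(M, j), N ** j)
        for j in range(1, q + 1)
    )

def moment_via_binomial(M, N, q):
    """q-th moment of Binomial(M, 1/N), summed directly over its support."""
    p = Fraction(1, N)
    return sum(
        s ** q * comb(M, s) * p ** s * (1 - p) ** (M - s)
        for s in range(M + 1)
    )

print(moment_via_stirling(8, 8, 4) == moment_via_binomial(8, 8, 4))
```

Only moments of order at most $q$ are involved, which is why the same value is obtained under $q$-wise independence.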

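Anti-concentration bounds for boolean sums. Our main tool is the
following novel anti-concentration inequality for $q$-wise independent hashing:

Lemma 1 (Anti-concentration of balls-bins loads when $M = N$). Let $S$
be as in Proposition 1 and $q \geq 4$ be an even number. Then we have the following
inequality

$$\Pr\left[ S > \left( \tfrac{1}{2} \left( 1 - \tfrac{q^2}{2M} \right) B_{q/2} \right)^{2/q} \right] \geq \frac{1}{4} \left( 1 - \frac{q^2}{2M} \right)^{2} \frac{B_{q/2}^2}{B_q} \tag{2}$$

where $B_q$ are Bell numbers.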

Impossibilities for min-entropy $q$-wise independent condensers. Based
on the previous lemma and deriving some estimates on $B_q$ we finally obtain

Theorem 2 (Impossibilities for $q$-universal condensers). No $q$-universal
family from $n$ to $m$ bits can be a $(k, \ell, \epsilon)$-condenser where

$$k = m$$

$$\ell = \log q - \log\log q - \log(2e) + O\left( \frac{\log\log q}{\log q} \right)$$

$$\epsilon = 2^{-q \left( 1 + O\left( \frac{\log\log q}{\log q} \right) \right)}$$

provided that $k > 2\log q$. Equivalently, it is not a $(k, \ell, \epsilon)$-condenser when

$$k = m$$

$$\ell = \log\log(1/\epsilon) - \log\log\log(1/\epsilon) - \log(2e) + O\left( \frac{\log\log\log(1/\epsilon)}{\log\log(1/\epsilon)} \right)$$

$$q = \log(1/\epsilon) - \omega\left( \frac{\log(1/\epsilon) \cdot \log\log\log(1/\epsilon)}{\log\log(1/\epsilon)} \right)$$

provided that $m > 2\log\log(1/\epsilon)$. These facts are even true for all flat $k$-sources.

Note that the additional assumption $k > 2\log q$ with $q = \log(1/\epsilon)$ is trivially
satisfied for all practical applications. For the recommended security $\epsilon = 2^{-80}$ it
becomes $k \geq 13$. Since $k = m$, it is satisfied even for condensing into only $m = 13$
bits! We also stress that the result in Theorem 2 is asymptotically tight, but
better hard bounds can be obtained by using Lemma 1 directly. For example, setting
$q = 64$ doesn't yield a condenser with loss $\ell = 2.6$ and quality $\epsilon = 2^{-43}$, whereas
the positive result yields a condenser with $\ell = 6$ and $\epsilon = 2^{-64}$.
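The numerical example for $q = 64$ can be explored with a few lines of code. The sketch below is an assumption-laden illustration rather than the exact computation behind the example: it evaluates only the two Bell-number quantities that drive the anti-concentration bound, $\log B_{q/2}^{2/q}$ (the order of the load threshold) and $\log(B_{q/2}^2 / B_q)$ (the order of the probability), and it ignores the constant factors coming from $\theta$ and $M$. The function names are hypothetical.

```python
import math

def bell_numbers(n_max):
    """Return [B_0, ..., B_n_max] computed with the Bell triangle."""
    bells = [1]
    row = [1]
    for _ in range(n_max):
        new_row = [row[-1]]          # each row starts with the last entry of the previous one
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])
    return bells

def bell_quantities(q):
    """log2 of B_{q/2}^{2/q} and log2 of B_{q/2}^2 / B_q, for even q."""
    bells = bell_numbers(q)
    half = math.log2(bells[q // 2])
    log2_threshold = (2 / q) * half
    log2_ratio = 2 * half - math.log2(bells[q])
    return log2_threshold, log2_ratio

threshold, ratio = bell_quantities(64)
print(f"log2 B_32^(1/16)    = {threshold:.2f}")
print(f"log2 (B_32^2 / B_64) = {ratio:.2f}")
```

For $q = 64$ the first value is about 2.7 and the second about $-43$, which is the scale of the loss and quality quoted in the example.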


2 Preliminaries 

Entropy and Statistical Closeness. We say that $X$ has $k$ bits of min-entropy
if $\Pr[X = x] \leq 2^{-k}$ for every $x$ in the range of $X$. Alternatively, we
call $X$ a $k$-source. The statistical distance of $X_1, X_2$ is defined by
$\mathrm{SD}(X_1; X_2) = \frac{1}{2} \sum_x \left| \Pr[X_1 = x] - \Pr[X_2 = x] \right|$;
we also say that $X_1$ and $X_2$ are $\epsilon$-close if $\mathrm{SD}(X_1; X_2) \leq \epsilon$. An $n$-bit
key $X$ is $\epsilon$-secure if it is $\epsilon$-close to the uniform distribution over $n$-bit strings. In
practice we think of $\epsilon = 2^{-80}$ as small enough to offer good security
(indistinguishability).
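Both notions are easy to compute for explicitly given finite distributions. The following sketch assumes a distribution is represented as a dictionary mapping outcomes to probabilities; this representation and the function names are choices made for illustration.

```python
import math

def min_entropy(dist):
    """Min-entropy in bits: -log2 of the largest point probability."""
    return -math.log2(max(dist.values()))

def statistical_distance(d1, d2):
    """Half of the L1 distance between two distributions on a common domain."""
    support = set(d1) | set(d2)
    return 0.5 * sum(abs(d1.get(x, 0.0) - d2.get(x, 0.0)) for x in support)

uniform = {x: 0.25 for x in range(4)}
skewed = {0: 0.5, 1: 0.25, 2: 0.25}
print(min_entropy(uniform), min_entropy(skewed))
print(statistical_distance(uniform, skewed))
```

Here the uniform distribution on four points is a 2-source, the skewed one is only a 1-source, and the two are 0.25-close.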

Independent Hashing. A family $\{h_s\}_s$ of functions from $n$ to $m$ bits is
called a $q$-wise independent hash family (or simply $q$-universal) if for any choice of
distinct $n$-bit strings $x_1, \ldots, x_q$ and a randomly chosen $s$ the random variables
$h_s(x_1), \ldots, h_s(x_q)$ are independent and uniformly distributed. This concept is due
to Carter and Wegman [CW77].
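A textbook way to obtain such a family is to evaluate a random polynomial of degree less than $q$ over a finite field. The sketch below is a simplified illustration and not a construction taken from this paper: it works over the prime field $\mathbb{Z}_p$ instead of bit strings, so inputs and outputs are integers in $\{0, \ldots, p-1\}$, and the names `sample_seed` and `poly_hash` are hypothetical.

```python
import random

def sample_seed(q, p, rng=random):
    """Seed: q coefficients of a random polynomial of degree < q over Z_p."""
    return [rng.randrange(p) for _ in range(q)]

def poly_hash(seed, x, p):
    """Evaluate the seed polynomial at x modulo the prime p (Horner's rule)."""
    acc = 0
    for coeff in reversed(seed):
        acc = (acc * x + coeff) % p
    return acc

p = 101                      # a prime; assumption made for this example
seed = sample_seed(4, p)     # 4-wise independent over Z_p
print([poly_hash(seed, x, p) for x in range(5)])
```

The seed consists of $q$ field elements, which illustrates why a larger independence level $q$ increases the sampling cost mentioned earlier.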

Combinatorial Numbers. The Bell number $B_q$ counts the total number of
partitions of a $q$-element set. The Stirling number of the second kind $S(q,j)$ counts
the number of partitions of a $q$-element set into precisely $j$ blocks.
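Since every partition has some number $j$ of blocks, the two families are related by $B_q = \sum_j S(q,j)$. The short sketch below computes a row of Stirling numbers with the standard recurrence $S(n,j) = j \cdot S(n-1,j) + S(n-1,j-1)$ and sums it; the function names are illustrative.

```python
def stirling2_row(q):
    """Return [S(q,0), ..., S(q,q)] using S(n,j) = j*S(n-1,j) + S(n-1,j-1)."""
    row = [1]  # S(0,0) = 1
    for n in range(1, q + 1):
        new_row = [0] * (n + 1)
        for j in range(1, n + 1):
            prev_same = row[j] if j < n else 0  # S(n-1,n) = 0
            new_row[j] = j * prev_same + row[j - 1]
        row = new_row
    return row

def bell(q):
    """Bell number as the sum of the q-th row of Stirling numbers."""
    return sum(stirling2_row(q))

print(stirling2_row(4), bell(4))
```

For a 4-element set this gives 1, 7, 6 and 1 partitions into one, two, three and four blocks, so $B_4 = 15$.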

Sources and Condensers. A distribution $X$ is called a $k$-source if it has
min-entropy at least $k$. The following definition formalizes the notion of min-entropy
condensers, whose purpose is to increase the entropy rate (density):

Definition 1 (Min-entropy condensers). A function $\mathrm{Cond} : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$
is a $(k, \ell, \epsilon)$-condenser with a $d$-bit seed if for any $k$-source $X$ and a
randomly chosen $s \in \{0,1\}^d$ the distribution of $\mathrm{Cond}(X, s)$ is $\epsilon$-close to some
distribution of $m - \ell$ bits of min-entropy.




3 Proofs 


3.1 Proof of Lemma 1 

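Recall the standard Paley-Zygmund inequality

$$\Pr[Y > \theta \, \mathbb{E} Y] \geq (1 - \theta)^2 \, \frac{(\mathbb{E} Y)^2}{\mathbb{E} Y^2}, \qquad 0 < \theta < 1 \tag{3}$$

valid for arbitrary non-negative $Y$. Setting $Y = S^{q/2}$ we obtain

$$\Pr\left[ S > \theta^{2/q} \left( \mathbb{E} S^{q/2} \right)^{2/q} \right] \geq (1 - \theta)^2 \, \frac{\left( \mathbb{E} S^{q/2} \right)^2}{\mathbb{E} S^{q}} \tag{4}$$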


Note that for the special case $M = N$ by Proposition 1 we obtain

$$\mathbb{E} S^q = \sum_{j=1}^{q} S(q,j) \prod_{i=1}^{j} \left( 1 - \frac{i-1}{M} \right)$$

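Since we have $\prod_{i=1}^{j} (1 - a_i) \geq 1 - \sum_{i=1}^{j} a_i$ for $a_i \in [0,1]$ (an elementary
inequality provable by induction) and $\sum_{j} S(q,j) = B_q$, the moments satisfy
$\left( 1 - (2M)^{-1} q^2 \right) B_q \leq \mathbb{E} S^q \leq B_q$, and Equation (4) specializes to

$$\Pr\left[ S > \left( \theta \left( 1 - (2M)^{-1} q^2 \right) B_{q/2} \right)^{2/q} \right] \geq (1 - \theta)^2 \left( 1 - (2M)^{-1} q^2 \right)^2 \frac{B_{q/2}^2}{B_q} \tag{5}$$

Setting $\theta = \frac{1}{2}$ we obtain $\theta^{2/q} > \frac{1}{4}$ and $(1 - \theta)^2 \geq \frac{1}{4}$.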
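The sandwich between Bell numbers and load moments that this argument relies on can be verified numerically. The sketch below is an illustration under stated assumptions: it takes the product form of $\mathbb{E} S^q$ for $M = N$ as given and checks, in exact arithmetic, that $\left(1 - q^2/(2M)\right) B_q \leq \mathbb{E} S^q \leq B_q$. The function names are hypothetical.

```python
from fractions import Fraction

def stirling2_row(q):
    """Return [S(q,0), ..., S(q,q)] using S(n,j) = j*S(n-1,j) + S(n-1,j-1)."""
    row = [1]
    for n in range(1, q + 1):
        new_row = [0] * (n + 1)
        for j in range(1, n + 1):
            prev_same = row[j] if j < n else 0
            new_row[j] = j * prev_same + row[j - 1]
        row = new_row
    return row

def moment_m_equals_n(M, q):
    """E S^q = sum_j S(q,j) * prod_{i=1..j} (1 - (i-1)/M), assuming M = N."""
    row = stirling2_row(q)
    total = Fraction(0)
    for j in range(1, q + 1):
        prod = Fraction(1)
        for i in range(1, j + 1):
            prod *= 1 - Fraction(i - 1, M)
        total += row[j] * prod
    return total

M, q = 2 ** 13, 8
bell_q = sum(stirling2_row(q))
moment = moment_m_equals_n(M, q)
lower = (1 - Fraction(q * q, 2 * M)) * bell_q
print(float(lower), float(moment), bell_q)
```

With $M = 2^{13}$ and $q = 8$ the three printed values are very close to each other, which shows how little is lost by replacing the moment with a Bell number when $M \gg q^2$.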


Lemma 2 (Maximum of Stirling numbers of the second kind, [RD69]). We
have

$$\frac{\max_j \ln S(q,j)}{q} = \ln q - \ln\ln q - 1 + O\left( \frac{\ln\ln q}{\ln q} \right) \tag{6}$$

as $q \to +\infty$.

By noticing that $\max_j S(q,j) \leq B_q \leq q \cdot \max_j S(q,j)$ we obtain the following
corollary about the growth rate of Bell numbers.

Corollary 1 (Bounds on Bell numbers). We have

$$\frac{\ln B_q}{q} = \ln q - \ln\ln q - 1 + O\left( \frac{\ln\ln q}{\ln q} \right) \tag{7}$$

as $q \to +\infty$.
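The growth rate in Corollary 1 can be compared with exact Bell numbers for moderate $q$. The sketch below, with hypothetical function names, computes the gap between $\ln B_q / q$ and the main term $\ln q - \ln\ln q - 1$; the corollary says that this gap is of order $\ln\ln q / \ln q$, so it vanishes only slowly.

```python
import math

def bell_numbers(n_max):
    """Return [B_0, ..., B_n_max] computed with the Bell triangle."""
    bells = [1]
    row = [1]
    for _ in range(n_max):
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])
    return bells

def corollary_gap(q, bells):
    """ln(B_q)/q minus the main term ln q - ln ln q - 1."""
    main_term = math.log(q) - math.log(math.log(q)) - 1
    return math.log(bells[q]) / q - main_term

bells = bell_numbers(200)
for q in (8, 16, 64, 200):
    print(q, round(corollary_gap(q, bells), 3))
```

The slow decay of the gap is one reason why the hard bounds expressed directly through Bell numbers are more informative than the asymptotic statement for practical values of $q$.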














3.2 Proof of Theorem 2 


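Proof. Since $1 - (2M)^{-1} q^2 \geq \frac{1}{2}$ when $M \geq q^2$, we need to ensure that

$$\frac{B_{q/2}^2}{B_q} > 4\epsilon$$

which is equivalent to

$$\frac{\ln B_{q/2}}{q/2} - \frac{\ln B_q}{q} > \frac{\ln(4\epsilon)}{q}.$$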

By Corollary 1, this inequality means

$$\ln 2 \leq \frac{\ln(1/(4\epsilon))}{q} + O\left( \frac{\ln\ln q}{\ln q} \right) \tag{8}$$

which expressed in logarithms at base 2 is equivalent to

$$\frac{\log(1/\epsilon)}{q} \geq 1 + O\left( \frac{\log\log q}{\log q} \right).$$

By taking the inverses and using the Taylor series expansion $\frac{1}{1+x} \approx 1 - x$ for
$x \approx 0$ we can rewrite it as

$$\frac{q}{\log(1/\epsilon)} < 1 - O\left( \frac{\log\log q}{\log q} \right) \tag{9}$$

Thus, it suffices to find a good $q$ such that

$$\frac{q}{\log(1/\epsilon)} + c \cdot \frac{\log\log q}{\log q} < 1 \tag{10}$$


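where $c$ is a positive constant (comparable, up to a small constant factor, to the
absolute value of the constant hidden in Corollary 1). Take $q = (1 - \gamma) \log(1/\epsilon)$
where the exact value of $\gamma$ is to be determined. Since for $q \geq 8$ the function
$q \mapsto \frac{\log\log q}{\log q}$ is decreasing, we see that it suffices to satisfy

$$\gamma \geq c \cdot \frac{\log\log\log(1/\epsilon)}{\log\log(1/\epsilon)}$$

and therefore we put

$$\gamma = c \cdot \frac{\log\log\log(1/\epsilon)}{\log\log(1/\epsilon)},$$

that is, $q = \log(1/\epsilon) - c \cdot \frac{\log(1/\epsilon) \cdot \log\log\log(1/\epsilon)}{\log\log(1/\epsilon)}$,
which finishes the proof if we set $M = N = 2^k$.

□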

















4 Conclusions 


It would be interesting to extend the results to settings where $M > N$. We leave
this as an open problem for further research. We also plan to extend our
methods to cover the case of almost independent hash functions, which are most
suitable in practical implementations because of their much shorter seeds [?].